Detecting Errors and Imputing Missing Data for Single Loop Surveillance Systems

Authors

  • Chao Chen
  • Jaimyoung Kwon
  • Alexander Skabardonis
  • Pravin Varaiya
Abstract

Single loop detectors provide the most abundant source of traffic data in California, but loop data samples are often missing or invalid. We describe a method that detects bad data samples and imputes missing or bad samples to form a complete grid of 'clean' data in real time. The diagnostics algorithm and the imputation algorithm that implement this method are operational on 14,871 loops in six districts of the California Department of Transportation. The diagnostics algorithm detects bad (malfunctioning) single loop detectors from their volume and occupancy measurements. Its novelty is its use of a time series of many samples, rather than basing decisions on single samples as in previous approaches. The imputation algorithm models the relationship between neighboring loops as linear and uses linear regression to estimate the value of missing or bad samples. This gives a better estimate than previous methods because it uses historical data to learn how pairs of neighboring loops behave. Detection of bad loops and imputation of loop data are important because they allow algorithms that use loop data to perform analysis without having to compensate for missing or incorrect samples.

INTRODUCTION

Loop detectors are the best source of real-time freeway traffic data today. In California, these detectors cover most urban freeways. Loop data provide a powerful means to study and monitor traffic (2), but the data contain many holes (missing values) or bad (incorrect) values and require careful 'cleaning' to produce reliable results. Bad or missing samples present problems for any algorithm that uses the data for analysis. Therefore, we need both to detect when data are bad and throw them out, and to fill the holes in the data with imputed values. The goal is to produce a complete grid of reliable data; we can trust analyses that use such a complete data set.

We need to detect bad data from the measurements themselves. The problem has been studied by the FHWA, the Washington DOT, and others. Existing algorithms usually work on the raw 20-second or 30-second data and produce a diagnosis for each sample. But it is very hard to tell whether a single 20-second sample is good or bad unless it is very abnormal. Fortunately, loop detectors do not just produce random errors: some loops produce reasonable data all the time, while others produce suspect data all the time. By examining a time series of measurements, one can readily distinguish bad behavior from good. Our diagnostics algorithm examines a day's worth of samples together, producing convincing results.

Once bad samples are thrown out, the resulting holes in the data must be filled with imputed values. Imputation using time series analysis has been suggested before, but such imputations are only effective for short periods of missing data; linear interpolation and neighborhood averages are natural imputation methods, but they do not use all the relevant data that are available. Our imputation algorithm estimates values at a detector using data from its neighbors. The algorithm models each pair of neighbors linearly and fits its parameters on historical data. It is robust, and it performs better than other methods.

We first describe the data and the types of errors that are observed. We then survey current methods of error detection, which operate on single 20-second samples. Next we present our diagnostics algorithm and show that it performs better than these methods. Finally, we present our imputation algorithm and show that it is better than other imputation methods such as linear interpolation.
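To make the neighbor-based idea concrete, here is a minimal sketch of pairwise linear imputation. It is an illustration under stated assumptions, not the PeMS implementation: the data are synthetic placeholders, a single neighbor is used, and the coefficients are fit by ordinary least squares.

    import numpy as np

    # Sketch of pairwise linear imputation: model this loop's 30-second flow as a
    # linear function of a neighboring loop's flow, fit the coefficients on
    # historical samples where both loops reported good data, and use the fit to
    # estimate a missing or bad sample. All numbers below are synthetic.
    rng = np.random.default_rng(0)
    q_neighbor_hist = rng.uniform(0, 20, size=2880)          # neighbor's historical 30-s flows
    q_loop_hist = 0.9 * q_neighbor_hist + 1.0 + rng.normal(0, 0.5, size=2880)

    a, b = np.polyfit(q_neighbor_hist, q_loop_hist, deg=1)    # fit q_loop ~ a * q_neighbor + b

    q_neighbor_now = 12.0                                     # neighbor's current sample
    q_loop_imputed = a * q_neighbor_now + b                   # estimate for the missing sample
    print(f"imputed flow: {q_loop_imputed:.1f} vehicles per 30 s")

The same construction applies to occupancy, and a loop with several neighbors yields several such pairwise estimates that can be combined.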
DESCRIPTION OF DATA

The freeway Performance Measurement System (PeMS) (1,2) collects, stores, and analyzes data from thousands of loop detectors in six districts of the California Department of Transportation (Caltrans). The PeMS database currently has 1 terabyte of data online and collects more than 1 GB per day. PeMS uses the data to compute freeway usage and congestion delays, measure and predict travel time, evaluate ramp-metering methods, and validate traffic theories.

There are 14,871 main line (ML) loops in the PeMS database from six Caltrans districts; the results presented here are for main line loops. Each loop reports the volume q(t), the number of vehicles crossing the loop detector during a 30-second time interval t, and the occupancy k(t), the fraction of this interval during which there is a vehicle above the loop. We call each pair of volume and occupancy observations a sample. The total number of possible samples in one day from ML loops in PeMS is therefore (14,871 loops) × (2,880 samples per loop per day), or about 43 million samples. In reality, however, PeMS never receives all the samples; Los Angeles, for example, has a missing sample rate of about 15%. While it is clear when samples are missing, it is harder to tell when a received sample is bad or incorrect. A diagnostics test needs to accept or reject samples based on our assumption of what good and bad samples look like.

EXISTING DATA RELIABILITY TESTS

Errors in loop data have long plagued their effective use. In 1976, Payne (3) identified five types of detector errors and presented several methods to detect them from 20-second and 5-minute volume and occupancy measurements. These methods place thresholds on minimum and maximum flow, density, and speed, and declare a sample invalid if it fails any of the tests. Later, Jacobsen and Nihan at the University of Washington defined an 'acceptable region' in the k-q plane and declared samples to be good only if they fell inside the region (4). We call this the Washington Algorithm. The boundaries of the acceptable region are defined by a set of parameters, which are calibrated from historical data or derived from traffic theory.

Existing detection algorithms (3,4,5) try to catch the errors described in (3). For example, 'chattering' and 'pulse break-up' cause q to be high, so a threshold on q can catch these errors. But some errors cannot be caught this way, such as a detector stuck in the 'off' (q=0, k=0) position. Payne's algorithm would identify this as a bad point, but good detectors also report (0,0) when no vehicles pass during the detection period, so eliminating all (0,0) points introduces a positive bias in the data. The Washington Algorithm, on the other hand, accepts the (0,0) point, but doing so makes it unable to detect the 'stuck' type of error. A threshold on occupancy is similarly hard to set: an occupancy value of 0.5 for one 30-second period should not indicate an error, but a large number of 30-second samples with occupancies of 0.5, especially during non-rush hours, points to a malfunction.
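For concreteness, a per-sample check in the spirit of these tests might look like the sketch below. The bounds are arbitrary placeholders, not Payne's thresholds or the calibrated acceptable region of (4).

    # Sketch of a per-sample acceptance test: a (volume, occupancy) sample is
    # rejected only if it violates one of the bounds. The bounds are illustrative.
    def sample_is_acceptable(q, k, q_max=25.0, k_max=0.95, qk_slope_max=150.0):
        """q: vehicles per 30-second sample; k: occupancy in [0, 1]."""
        if q < 0 or k < 0 or q > q_max or k > k_max:
            return False
        if k > 0 and q / k > qk_slope_max:                  # crude bound on the q/k slope
            return False
        return True

    print(sample_is_acceptable(q=8, k=0.12))   # ordinary free-flow sample -> True
    print(sample_is_acceptable(q=0, k=0.70))   # zero flow at 70% occupancy -> also True;
                                               # one such sample could be a stopped vehicle,
                                               # so a per-sample test cannot safely reject it

Tightening these bounds enough to flag such samples would also reject many legitimate ones, which is the calibration difficulty discussed next.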
We implemented the Washington Algorithm in Matlab and tested it on 30-second data from two loops in Los Angeles for one day. The acceptable region is taken from (4). The data and their diagnoses are shown in Figure 1. Visually, loop 1 looks good (Figure 1b) and loop 2 looks bad (Figure 1d). Loop 2 looks bad because there are many samples with k=70% and q=0, as well as many samples with occupancies that appear too high, even during non-rush hours and when loop 1 shows low occupancy. The Washington Algorithm, however, does not make the correct diagnosis. Out of 2,875 samples, it declared 1,138 bad for loop 1 and 883 bad for loop 2. In both loops there were many false alarms, because the maximum acceptable slope of q/k was exceeded by many samples in free flow. This suggests that the algorithm is very sensitive to its thresholds and needs to be calibrated for California. Calibration is impractical because each loop would need a separate acceptable region, and ground truth would be difficult to obtain. There are also false negatives: many samples from loop 2 appear to be bad because they have high occupancies during off-peak times, but they were not flagged by the Washington Algorithm.

This illustrates a difficulty with the threshold method: the acceptable region has to be very large, because many traffic states are possible within a 30-second period. On the other hand, much more information can be gained by looking at how a detector behaves over many sample times. This is why we easily recognize loop 1 to be good and loop 2 to be bad by looking at their k(t) plots, and this is the key insight that led to our diagnostics algorithm.

PROPOSED DETECTOR DIAGNOSTICS ALGORITHM

Design

The algorithm for loop error detection uses the time series of flow and occupancy measurements, instead of making a decision based on an individual sample. It is based on the empirical observation that good and bad detectors behave very differently over time. For example, at any given instant, the flow and occupancy at a detector location can take a wide range of values, and one cannot rule most of them out; but over a day, most detectors show a similar pattern: flow and occupancy are high during the rush hours and low late at night. Figures 2a and 2b show typical 30-second flow and occupancy measurements. Most loops have outputs that look like this, but some loops behave very differently. Figures 2c and 2d show an example of a bad loop. This loop has zero flow and an occupancy value of 0.7 for several hours during the evening rush hour; clearly, these values must be incorrect.

We found four types of abnormal time-series behavior, listed in Table 1. Types 1 and 4 are self-explanatory; types 2 and 3 are illustrated in Figures 2c, 2d and Figure 1b. The errors in Table 1 are not mutually exclusive; for example, a loop with all zero occupancy values exhibits both type 1 and type 4 errors. A loop is declared bad if it falls into any of these categories. We did not find a significant number of loops with chatter or pulse break-up, which would produce abnormally high volumes, so the current form of the detection algorithm does not check for this condition. However, a fifth error type and error check can easily be added to the algorithm to flag loops with consistently high counts.

We developed the Daily Statistics Algorithm (DSA) to recognize error types 1-4 above. The input to the algorithm is the time series of 30-second measurements q(d,t) and k(d,t), where d is the index of the day and t = 0, 1, 2, ..., 2879 is the 30-second sample number; the output is the diagnosis ∆(d) for the dth day: ∆(d) = 0 if the loop is good, and ∆(d) = 1 if the loop is bad.
In contrast to existing algorithms that operate on each sample, the DSA produces one diagnosis for all the samples of a loop on each day. We use only samples between 5am and 10pm for the diagnostics, because outside this period it is more difficult to tell the difference between good and bad loops. There are 2041 30-second samples in this period, so the algorithm is a function of 2041 × 2 = 4082 variables. Thus the diagnosis ∆(d) on day d is a function

∆(d) = f(q(d,a), q(d,a+1), ..., q(d,b), k(d,a), k(d,a+1), ..., k(d,b)),

where a = 5 × 120 = 600 is the sample number at 5am and b = 22 × 120 = 2640 is the last sample number, at 10pm. To deal with the large number of variables, we first reduce them to four statistics, S1, ..., S4, which are appropriate summaries of the time series. Their definitions are given in Table 2, where Sj(i,d) is the jth statistic computed for the ith loop on the dth day. The decision ∆ then becomes a function of these four statistics: for the ith loop and dth day, whether the loop is good or bad is determined according to a rule applied to S1(i,d), ..., S4(i,d).
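The definitions of S1, ..., S4 (Table 2) and the thresholds in the decision rule are given in the paper's tables rather than in the text above, so the sketch below substitutes stand-in statistics and thresholds that are merely consistent with error types 1-4 (long runs of zero occupancy, zero flow with nonzero occupancy, implausibly high occupancy, and near-constant output). It shows the structure of a daily-statistics check, not the paper's exact rule.

    import numpy as np

    # Sketch of a daily-statistics check: summarize one day of 30-second samples
    # for one loop with four statistics, then declare the loop bad if any
    # statistic is out of range. Statistic definitions and thresholds are
    # stand-ins, not the paper's.
    SAMPLES_PER_HOUR = 120                                   # 30-second samples
    A, B = 5 * SAMPLES_PER_HOUR, 22 * SAMPLES_PER_HOUR       # 5am .. 10pm window

    def daily_diagnosis(q, k, zero_occ_max=1500, zero_flow_occ_max=300,
                        high_occ=0.35, high_occ_max=300, distinct_min=10):
        """Return 1 (bad) or 0 (good) for one loop on one day.

        q, k: length-2880 arrays of 30-second flow and occupancy.
        """
        q = np.asarray(q)[A:B + 1]                           # 2041 samples, 5am-10pm
        k = np.asarray(k)[A:B + 1]
        s1 = np.sum(k == 0)                                  # proxy for type 1: mostly zero occupancy
        s2 = np.sum((q == 0) & (k > 0))                      # proxy for type 2: zero flow, nonzero occupancy
        s3 = np.sum(k > high_occ)                            # proxy for type 3: implausibly high occupancy
        s4 = np.unique(np.round(k, 3)).size                  # proxy for type 4: near-constant output
        bad = (s1 > zero_occ_max or s2 > zero_flow_occ_max
               or s3 > high_occ_max or s4 < distinct_min)
        return int(bad)

    # A loop reporting zero flow and occupancy 0.7 all day (cf. Figures 2c and 2d) is flagged:
    print(daily_diagnosis(np.zeros(2880), np.full(2880, 0.7)))   # -> 1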
